Abstract: Let $C = \{x_1,\dots,x_N\} \subseteq \{0,1\}^n$ be an $[n,N]$
binary error correcting code (not necessarily linear). Let
$e \in \{0,1\}^n$ be an error vector. A codeword $x \in C$ is said to be
disturbed by the error $e$ if the closest codeword to $x \oplus e$ is
no longer $x$. Let $A_e$ be the subset of codewords in $C$ that
are disturbed by $e$. In this work we study the size of $A_e$ in
random codes $C$ (i.e. codes in which each codeword $x_i$ is chosen
uniformly and independently at random from $\{0,1\}^n$). Using
recent results of Vu [Random Structures and Algorithms 20(3)]
on the concentration of non-Lipschitz functions, we show that
$|A_e|$ is strongly concentrated for a wide range of values of
$N$ and $\|e\|$.

We apply this result in the study of communication channels we
refer to as oblivious. Roughly speaking, a channel $W(y|x)$ is said
to be oblivious if the error distribution imposed by the channel
is independent of the transmitted codeword $x$. For example, the
well studied Binary Symmetric Channel is an oblivious channel.

In this work, we define oblivious and partially oblivious 
channels and present lower bounds on their capacity. The 
oblivious channels we define have connections to Arbitrarily 
Varying Channels with state constraints. 

I. Introduction 

For a parameter $n$, a general (not necessarily memoryless)
binary communication channel $W$ for block length $n$ is a
probability distribution over $\{0,1\}^n \times \{0,1\}^n$. Namely, $W$
is defined by the conditional probabilities $W(y|x)$ that
$y \in \{0,1\}^n$ is received when $x \in \{0,1\}^n$ is transmitted.

An $[n,N]$ binary block code $C$ is defined by a codebook of
$N$ codewords $C = \{x_1,\dots,x_N\}$ in $\{0,1\}^n$ corresponding to
messages $\{1,\dots,N\} = [N]$ and a decoder $\phi : \{0,1\}^n \to [N]$.
The probability of error for message $i$, when $C$ is used on a
channel $W$, is $e(i) = \sum_{y : \phi(y) \neq i} W(y|x_i)$.
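The error probability defined above can be evaluated exactly for very small block lengths. The following is a minimal sketch, assuming codewords are stored as Python integers, messages are indexed from 0 rather than 1, the decoder is nearest neighbor with ties going to the smallest index, and the channel is a Binary Symmetric Channel; the function names are illustrative and not taken from the paper.

```python
def hamming(a: int, b: int) -> int:
    """Hamming distance between two binary words stored as integers."""
    return bin(a ^ b).count("1")


def nearest_neighbor(codebook, y):
    """Index of the closest codeword to y (assumption: ties go to the smallest index)."""
    return min(range(len(codebook)), key=lambda i: (hamming(codebook[i], y), i))


def bsc(n, p):
    """Return W(y|x) for a Binary Symmetric Channel with crossover probability p."""
    def w(y, x):
        d = hamming(x, y)
        return p ** d * (1 - p) ** (n - d)
    return w


def error_probability(codebook, n, channel, i):
    """e(i): total probability of received words y that are not decoded to message i."""
    return sum(channel(y, codebook[i])
               for y in range(2 ** n)
               if nearest_neighbor(codebook, y) != i)


def average_error(codebook, n, channel):
    """Average of e(i) over all messages."""
    total = sum(error_probability(codebook, n, channel, i)
                for i in range(len(codebook)))
    return total / len(codebook)
```

For the length-3 repetition code `[0b000, 0b111]` on a channel with crossover probability 0.1, a message is decoded incorrectly exactly when two or three bits are flipped, so the average error is 3(0.1)^2(0.9) + (0.1)^3 = 0.028.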

An $[n,N]$ code $C$ is said to allow communication at rate $R$
over the channel $W$ with (average) error $\varepsilon > 0$ if $N \ge 2^{Rn}$
and $\bar{e} = \frac{1}{N}\sum_{i=1}^{N} e(i) \le \varepsilon$. An $[n,N]$ code $C$ is said to allow
communication at rate $R$ over a family of channels $\mathcal{W}$ with
error $\varepsilon$ if for every $W \in \mathcal{W}$ the code $C$ allows communication
at rate $R$ over $W$ with error $\varepsilon$. Rate $R$ is an achievable rate
for the family $\mathcal{W}$ if for every $\varepsilon > 0$, $\delta > 0$ and every
sufficiently large $n$ there exists an $[n,N]$ code $C$ such that
$C$ allows communication at rate at least $R - \delta$ over the family $\mathcal{W}$
with error at most $\varepsilon$.^1 The maximum achievable rate is called
the capacity of the family $\mathcal{W}$, and is denoted by $C(\mathcal{W})$.

When considering the capacity of a family of channels
$\mathcal{W}$, one must address the design of error correcting codes
which allow communication under the uncertainty of which
channel $W$ is actually used from the family $\mathcal{W}$. Intuitively,
this corresponds to the design of codes which allow
communication in an adversarial jamming model in which an
entity $Z$ controlling the channel is assumed to act maliciously
within the limits of $\mathcal{W}$. We will adopt this interpretation in the
discussions throughout this work.

^1 In the study of communication over families of channels it is also common
to address the maximum error $\hat{e} = \max_i e(i)$ instead of $\bar{e}$; and the rate
achievable when using a distribution over codes (random coding) instead of
a deterministic code $C$ as above. These models are briefly addressed in the
Appendix.

A. This work 

Several families of channels have been studied over the
last few decades (for a nice survey on communication under
channel uncertainty see [10]). For a constant $p \in (0,1/2)$,
a $p$-channel $W$ is a channel for which $W(y|x) = 0$ if the
Hamming^2 distance between $x$ and $y$ is greater than $pn$. In
words, a $p$-channel can change at most $pn$ entries of
$x$. The parameter $p$ may be viewed as the amount of power
that can be used by the channel when imposing an error. In
this work we study the capacity of various families of binary
$p$-channels.

A natural starting point is the extensively studied family $\mathcal{W}_p$
of all binary $p$-channels. The capacity of $\mathcal{W}_p$ is a long standing
open problem. There is a strong connection between codes $C$
that allow communication over $\mathcal{W}_p$ and the minimum distance
of $C$. Namely, $C(\mathcal{W}_p)$ equals the maximum (asymptotic) rate
of $[n,N]$ block codes with minimum distance greater than $2pn$
(a detailed proof appears in the Appendix). The latter rate is
not known. It is known that this rate is bounded away from
$1 - H(p)$ (e.g. [2], [13], [15]), while the currently best known
lower bound stands at $1 - H(2p)$ (Gilbert-Varshamov [7],
[16]).

We will not study the capacity of $\mathcal{W}_p$; rather, we turn to
study certain subfamilies $\mathcal{W} \subseteq \mathcal{W}_p$. Consider the adversarial
model discussed above, in which an adversarial entity $Z$
may choose which channel $W \in \mathcal{W}$ to use based on the
code $C$ shared by the sender and receiver. In the case of
communication over $\mathcal{W}_p$ this adversarial entity $Z$ is very
powerful, as it can choose any $p$-channel $W$ and tailor the
error it imposes to fit not only the code $C$ in use but also
the codeword $x$ transmitted. Indeed, $Z$ can use a channel
$W(y|x) \in \mathcal{W}_p$ in which the error distribution imposed by
the channel strongly depends on the transmitted codeword $x$.

In this work we study scenarios in which $Z$ is limited in its
dependence on $x$. Specifically, we study the scenario in which
the error imposed by $Z$ is oblivious or partly oblivious of the
codeword $x$ transmitted. For example, if $Z$ always imposes
exactly the same distribution over errors, no matter which
codeword $x$ is sent, then $Z$ is said to be completely oblivious
of $x$. A well studied oblivious channel is the Binary Symmetric
Channel with crossover probability $p$. We denote this channel
by $W_{BSC_p}$. Indeed, no matter which codeword $x$ is transmitted,
the error imposed by $W_{BSC_p}$ follows the same distribution.
In this work we define and study families of channels with
varying degrees of obliviousness.

^2 Let $x = x_1 x_2 \cdots x_n$ be an element in $\{0,1\}^n$. The Hamming weight
$\|x\|$ is defined to be the number of positions $i$ in which $x_i \neq 0$.

B. Oblivious channels 

We start by giving a slightly different (but equivalent)
definition of a binary channel $W$. Instead of defining $W$
in terms of the conditional probabilities $W(y|x)$, one may
define $W$ in terms of the conditional probabilities $W(e|x)$,
where $e \in \{0,1\}^n$ is the error imposed by the channel $W$.
Specifically, in this setting $y = x \oplus e$. For example, by our
definitions, a $p$-channel $W$ is a channel for which $W(e|x) = 0$
for every $e$ of Hamming weight above $pn$. Let $\Pi$ be the set of
distributions over errors $e \in \{0,1\}^n$. In this setting, a channel
$W$ may be viewed as a function from $x \in \{0,1\}^n$ to the
set $\Pi$. Now we are ready to define $\gamma$-oblivious channels for
$\gamma \in [0,1]$.

Roughly speaking, a channel $W : \{0,1\}^n \to \Pi$ is said
to be oblivious if it is a constant function. In this case we
will say that $W$ is 1-oblivious. The obliviousness of a channel
is determined by the size of its image. Namely, channels $W$
with image size at most $2^n$ will be referred to as 0-oblivious
channels (thus any channel is 0-oblivious). For $\gamma \in [0,1]$,
channels with image size at most $2^{(1-\gamma)n}$ will be referred
to as $\gamma$-oblivious.

Definition 1.1: A channel $W$ with block length $n$ is
$\gamma$-oblivious if there is a $2^{(1-\gamma)n}$ sized family of distributions
$\pi = \{\pi_1,\dots,\pi_{2^{(1-\gamma)n}}\} \subseteq \Pi$, such that for every $x \in \{0,1\}^n$
the marginal distribution $W(\cdot|x)$ over $e$ is in the set $\pi$. A
family of channels $\mathcal{W}$ is $\gamma$-oblivious if each $W \in \mathcal{W}$
is $\gamma$-oblivious.
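Definition 1.1 measures obliviousness by the number of distinct error distributions a channel uses. The following is a minimal sketch of that count for toy block lengths, assuming a channel is represented as a Python dict that maps each input word to its error distribution (itself a dict from error to probability); this representation and the function names are illustrative assumptions, not part of the paper.

```python
import math


def image_size(channel):
    """Number of distinct error distributions W(.|x) over all inputs x."""
    return len({frozenset(dist.items()) for dist in channel.values()})


def obliviousness(channel, n):
    """Largest gamma such that the image size is at most 2^((1 - gamma) * n)."""
    return 1 - math.log2(image_size(channel)) / n
```

A channel that applies the same distribution to every input has image size 1 and is 1-oblivious, while a channel that uses a different distribution for each of the $2^n$ inputs is only 0-oblivious.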

For example, the Binary Symmetric Channel is 1-oblivious,
as $W_{BSC_p}(e|x)$ is completely independent of $x$; and the
family $\mathcal{W}_p$ is 0-oblivious (and not $\gamma$-oblivious for any $\gamma > 0$).
Let $\mathcal{W}_{p,\gamma}$ be the family of all $p$-channels that are $\gamma$-oblivious.
In this work we study the capacity of $\mathcal{W}_{p,\gamma}$ for various values
of $p$ and $\gamma$. The main result of this work can be summarized
in the following theorem.

Theorem 1: For any $p \in [0,1/2)$ and any $\gamma \in \left(\frac{2+H(p)}{3}, 1\right]$,

$C(\mathcal{W}_{p,\gamma}) \ge \gamma - H(p)$.

A few remarks are in order. It is not hard to verify (a detailed
proof appears in the Appendix) that for $\gamma = 1$, Theorem 1 is
tight. Namely, $C(\mathcal{W}_{p,1}) = 1 - H(p)$ (the capacity of $W_{BSC_p}$
[14]); this follows from the fact that $W_{BSC_p}$ is a 1-oblivious
channel which in essence^3 is also a $p$-channel. It also holds that
$C(\mathcal{W}_{p,\gamma}) \ge C(\mathcal{W}_p) \ge 1 - H(2p)$. A simple calculation shows
that $1 - H(2p)$ may be above the bound of Theorem 1 only for
very small $p < 0.07$ and $\gamma \in \left(\frac{2+H(p)}{3},\, 1 - H(2p) + H(p)\right)$.

^3 Notice that $W_{BSC_p}$ is not a $p$-channel; however, the error it imposes is
expected to be of Hamming weight $pn$.
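The threshold $p < 0.07$ quoted above can be checked numerically. The sketch below assumes the admissible range of $\gamma$ in Theorem 1 starts at $(2+H(p))/3$, so that $1 - H(2p)$ can exceed $\gamma - H(p)$ for some admissible $\gamma$ exactly when $(2+H(p))/3 < 1 - H(2p) + H(p)$. The bisection interval $[0.01, 0.1]$ is our own choice, on which the condition holds at the left end and fails at the right end.

```python
import math


def H(x: float) -> float:
    """Binary entropy function."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1 - x) * math.log2(1 - x)


def gv_can_beat_theorem1(p: float) -> bool:
    """True when some admissible gamma satisfies 1 - H(2p) > gamma - H(p)."""
    return (2 + H(p)) / 3 < 1 - H(2 * p) + H(p)


def crossover(lo: float = 0.01, hi: float = 0.1, iters: int = 60) -> float:
    """Bisection for the largest p in [lo, hi] at which the condition still holds."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if gv_can_beat_theorem1(mid):
            lo = mid
        else:
            hi = mid
    return lo
```

Under these assumptions the crossover lies between 0.06 and 0.07, consistent with the remark above.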

The study of $C(\mathcal{W}_{p,\gamma})$ arises when considering
communication in an adversarial jamming model in which the jammer $Z$
is limited in resources. Primarily, we restrict the jammer to flip
at most a $p$-fraction of the bits transmitted, which corresponds
to a power constraint imposed on $Z$. In addition, we limit the
jammer's view of the transmitted codeword. This is obtained
by forcing the jammer to use a channel $W$ which cannot
properly differentiate between different codewords $x$. Namely,
by restricting $W$ to impose its error based on only a small
number of possible error distributions, it must be the case
that the exact same distribution is used on large portions of
codewords.

An alternative (but problematic) definition of $\gamma$-oblivious
channels $W$ that may come to mind is one in which we restrict
$\max_X I(X;Z)$ to be at most $(1-\gamma)n$. Here $X$ represents
any distribution over codewords transmitted and $Z$ denotes
the error imposed by the channel. The random variables $X$
and $Z$ are jointly distributed according to $W(e|x)$. There
are various connections between the suggested definition and
the original one given in Definition 1.1. However, they are
not equivalent, and roughly speaking, the suggested definition
implies a discontinuous capacity function at the point $\gamma = 1$.
A detailed discussion appears in the Appendix.

C. Previous results and connection to AVC's 

To the best of our knowledge, $\gamma$-oblivious $p$-channels for
general $\gamma \in [0,1]$ have not been addressed in the past. For
the special case $\gamma = 1$, as we state shortly, there is a strong
connection between 1-oblivious $p$-channels and so called
arbitrarily varying channels (AVC) with state constraints.

A (discrete memoryless) arbitrarily varying channel [3] of
block length $n$ is a family of channels defined by a set of
states $S$ and a set of channels $\mathcal{S} = \{W_s(y|x) \mid s \in S\}$ of block
length 1 (in the binary case $x$ and $y$ are in $\{0,1\}$). Specifically,
the family $\mathcal{W}_{\mathcal{S}}$ that corresponds to $\mathcal{S}$ consists of the channels
$\{W_s \mid s \in S^n\}$ defined by $W_s(y|x) = \prod_{i=1}^{n} W_{s_i}(y_i|x_i)$. In the
above, $x = x_1,\dots,x_n$; $y = y_1,\dots,y_n$; and $s = s_1,\dots,s_n$. If
we associate with each state $s \in S$ a cost $\ell(s)$, an AVC family
with state constraint $p$ is the family of channels $W_s \in \mathcal{W}_{\mathcal{S}}$
for which $\frac{1}{n}\sum_{i=1}^{n} \ell(s_i) \le p$.

Consider the binary 1-block channels $W_0$ and $W_1$ defined
by $W_s(y|x) = 1$ iff $x + s = y$ modulo 2. Let $\ell(s) = s$ for
$s \in \{0,1\}$. Let $\mathcal{W}^*$ denote the AVC family defined by $W_0$
and $W_1$ with state constraint $p$. The families $\mathcal{W}_{p,1}$ and $\mathcal{W}^*$
are closely related and it holds that $C(\mathcal{W}_{p,1}) = C(\mathcal{W}^*)$.

The capacity of AVCs with state (and also input) constraints
was studied extensively in the works of Csiszár and Narayan
[4], [5]. Using proof techniques that build strongly upon
the method of types, Csiszár and Narayan show that the
capacity $C(\mathcal{W}^*)$ is $1 - H(p)$. This proves Theorem 1
for the case $\gamma = 1$. The proof presented in this work differs
substantially from the proofs of Csiszár and Narayan. Namely,
our proof technique is combinatorial in nature and is based on
a relatively new "strong concentration inequality" of [17]. This
inequality and its application in the context of coding theory
may be of independent interest.

For $\gamma < 1$, $\gamma$-oblivious channels were not defined or
discussed in [4], [5]. However, a careful examination of their
proof techniques yields an implicit bound on the capacity
$C(\mathcal{W}_{p,\gamma})$ for large values of $\gamma$. Namely, it can be shown using
the proof techniques that appear in [4] that
$C(\mathcal{W}_{p,\gamma}) \ge 1 - H(p) - 30(1-\gamma)$. For comparison, using our proof techniques
we show that $C(\mathcal{W}_{p,\gamma}) \ge 1 - H(p) - (1-\gamma)$.

D. Proof Techniques, random codes, and list decodable codes 

To prove the lower bound of Theorem 1 we need to show the
existence of high rate codes $C$ which enable communication
over $\gamma$-oblivious $p$-channels. We first note that no linear code
will suffice. Roughly speaking, this follows from the fact
that each codeword in a linear code $C$ has exactly the
same "neighborhood structure". Thus, when a linear code
is used, the problem of communicating over the oblivious
or partially oblivious families $\mathcal{W}_{p,\gamma}$ is equivalent to that of
communication over $\mathcal{W}_p$ (a detailed proof appears in the
Appendix). We thus turn to study codes which are not linear.
A natural candidate is a code $C$ in which the codewords
$C = \{x_1,\dots,x_N\}$ are chosen completely at random (i.e.
a code in which each codeword is chosen uniformly and
independently from $\{0,1\}^n$), and $\phi$ is the Nearest Neighbor
decoder. Let $e \in \{0,1\}^n$ be an error vector of Hamming
weight at most $pn$. A codeword $x$ is said to be disturbed
by the error $e$ if the closest codeword to $x \oplus e$ is no longer $x$.
Let $A_e = A_e(C)$ be the subset of codewords $x$ in $C$ that
are disturbed by $e$. In Section II we show that $C$ enables
communication over all $\gamma$-oblivious $p$-channels if for every
error $e$ of Hamming weight at most $pn$ the size of $A_e$ is
relatively small.

Hence, it suffices to analyze the size of $A_e$ over random
codebooks $C$. Specifically, we are interested in showing that
with positive probability $A_e$ is small for every error $e$ of
weight at most $pn$. Let $R = \gamma - H(p)$. It is straightforward
to verify that for a fixed error $e$, the expected size of $A_e$
taken over random $C = \{x_1,\dots,x_{\lfloor 2^{Rn} \rfloor}\}$ is relatively small.
Hence it is left to show that with high probability $|A_e|$ does
not deviate significantly from its expectation. Indeed, if this is
the case, a simple union bound will imply our assertion.

Strong concentration (or large deviation) inequalities have
been extensively studied. The usual way to prove such
inequalities is via the Azuma or Talagrand inequalities (e.g. [1]).
These inequalities work very well when the random variable
at hand has a small Lipschitz coefficient. In our case the
Lipschitz coefficient of $|A_e|$ is defined by the maximum of
$\big||A_e(C)| - |A_e(C')|\big|$ where $C$ and $C'$ are two codebooks
which differ only in a single codeword. It is not hard to
verify that the Lipschitz coefficient of $|A_e|$ may be very large.
However, we show that for most pairs $C$ and $C'$ as above,
the difference $\big||A_e(C)| - |A_e(C')|\big|$ is relatively small and
is bounded by the list decoding quality of $C$ (the maximal
number of codewords in $C$ which are included in a Hamming
ball of radius $pn$). With this in mind, we are able to use
a recent result of Vu [17] on the concentration of random
variables with large worst case Lipschitz coefficients but small
average case coefficients. The application of the framework
suggested in [17] to our random variable $|A_e|$ is somewhat
involved and can be viewed as the main technical contribution
of this paper.

There are other proof techniques which are common in the
study of probabilistic combinatorics. For example, so called
"correlation inequalities" (e.g. [1]) are often used to analyze
the probability of the intersection of many events. We would
like to note that such inequalities may also be used to study
the problem phrased above, however they only yield results
for small values of p that satisfy H{p) < i, as in this case 
the number of events considered is relatively small. 

Definition 1.2: Let $\Omega[n,N]$ be the distribution over $[n,N]$
codebooks $C = \{x_1,\dots,x_N\}$ in which each codeword in $C$
is chosen uniformly and independently from $\{0,1\}^n$.

Definition 1.3: For $x \in \{0,1\}^n$ and integer $r$, let $B(r,x)$
be the Hamming ball of radius $r$ centered at $x$.

Definition 1.4: For a given codebook $C = \{x_1,\dots,x_N\}$
and error $e \in \{0,1\}^n$, let
$A_e(C) = \{x_i \mid \exists j \neq i \text{ s.t. } x_j \in B(\|e\|, x_i \oplus e)\}$.
When the reference codebook $C$ is clear we
will denote $A_e(C)$ by $A_e$.
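Definition 1.4 translates directly into a brute-force computation. The following is a minimal sketch, assuming codewords and the error vector are stored as Python integers (bit $k$ of the integer is coordinate $k$ of the word) and that the set $A_e(C)$ is reported as a list of 0-based codeword indices; the function names are illustrative.

```python
def hamming_weight(v: int) -> int:
    """Number of nonzero coordinates of a binary word stored as an integer."""
    return bin(v).count("1")


def disturbed_indices(codebook, e):
    """Indices i such that some other codeword x_j lies in B(||e||, x_i XOR e)."""
    radius = hamming_weight(e)
    disturbed = []
    for i, x in enumerate(codebook):
        received = x ^ e
        if any(j != i and hamming_weight(other ^ received) <= radius
               for j, other in enumerate(codebook)):
            disturbed.append(i)
    return disturbed
```

For the codebook `[0b0000, 0b0011, 0b1111]` and the error `0b0001`, the first two codewords are disturbed (each is moved to distance 1 from the other), while `0b1111` stays at distance 3 from both other codewords.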

Theorem 2: Let $p \in [0,1/2)$. Let $\gamma \in \left(\frac{2+H(p)}{3}, 1\right]$. Let
$\delta > 0$ be any sufficiently small constant. Let $R = \gamma - H(p) - \delta$.
Let $n$ be sufficiently large. Let $e$ be any error vector in
$\{0,1\}^n$ of Hamming weight at most $pn$. Then
$\Pr\left[|A_e| - E(|A_e|) > 2^{(H(p)+2R-1)n}\right] < 2^{-2n}$. Here the probability is
over $\Omega[n, \lfloor 2^{Rn} \rfloor]$.

The remainder of this work is organized as follows. In
Section II we present some preliminaries on the distribution
$\Omega[n,N]$ and on oblivious channels. In Section III we present
the proof of Theorem 2 (which will imply Theorem 1).

II. Preliminaries 

For any integer $i$, let $[i]$ denote the set $\{1,2,\dots,i\}$. Let
$H(x) = -x\log_2 x - (1-x)\log_2(1-x)$ be the standard
(binary) entropy function. For a codebook $C = \{x_1,\dots,x_N\}$,
the corresponding Nearest Neighbor decoder is the decoder
which on input $y \in \{0,1\}^n$ returns the index of the closest
codeword $x_i$ in $C$ to $y$. For uniqueness, we will assume ties
are broken by the natural lexicographic ordering on $\{0,1\}^n$.
To simplify notation, for any $R \in [0,1]$ and integer $n$, we
assume throughout that $2^{Rn}$ is an integer.

Definition 2.1 (List decodability): An $[n,N]$ binary
codebook $C$ is said to be $[\ell,p]$ list decodable iff $|C \cap B(pn,y)| \le \ell$
for any $y \in \{0,1\}^n$.
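For toy parameters, list decodability in the sense of Definition 2.1 can be checked by exhaustive search over all centers $y$. The following is a minimal sketch, assuming codewords are Python integers and the ball radius is passed directly as an integer (playing the role of $pn$); it is exponential in $n$ and only meant to make the definition concrete.

```python
def max_list_size(codebook, n, radius):
    """Largest number of codewords inside any Hamming ball of the given radius."""
    return max(
        sum(1 for x in codebook if bin(x ^ y).count("1") <= radius)
        for y in range(2 ** n)
    )


def is_list_decodable(codebook, n, radius, ell):
    """True iff every Hamming ball of the given radius holds at most ell codewords."""
    return max_list_size(codebook, n, radius) <= ell
```

For example, the three codewords `0b0000`, `0b0001`, `0b0011` all lie within distance 1 of `0b0001`, so this codebook is list decodable with list size 3 but not with list size 2 at radius 1.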

We first analyze the list decoding properties of random
codes. The lemma that follows has appeared in various forms
in the past (e.g. [6], [18]). A full proof is given in the Appendix.

Lemma 2.1: Let $R < 1 - H(p)$. Let $n$ be sufficiently large.
Let $C$ be a random codebook in $\Omega[n,2^{Rn}]$. With probability
$1 - 2^{-n^2}$, $C$ is $[12n^2,p]$ list decodable.

Let $e$ be an error in $\{0,1\}^n$. Recall the definition of $A_e(C)$
from Definition 1.4. We now define an alternative sufficient
condition for a code $C$ to allow communication over
$\gamma$-oblivious $p$-channels. We will use this sufficient condition
throughout our work.

Lemma 2.2: An $[n,2^{Rn}]$ codebook $C$ with the Nearest
Neighbor decoder $\phi$ allows communication over $\mathcal{W}_{p,\gamma}$ within
error $\varepsilon$ if for every error $e \in B(pn,0)$ it is the case that $|A_e|$
is at most $\varepsilon 2^{(R-(1-\gamma))n}$.

Proof: Let $C = \{x_1,\dots,x_{2^{Rn}}\}$ be a codebook in
which for every error $e \in B(pn,0)$ it is the case that $|A_e|$
is at most $\varepsilon 2^{(R-(1-\gamma))n}$. Let $\phi$ be the Nearest Neighbor
decoder. Let $N = 2^{Rn}$. Let $W$ be a channel in $\mathcal{W}_{p,\gamma}$.
By Definition 1.1 and the fact that $W$ is a $p$-channel there
exists a family of distributions $\pi = \{\pi_1,\dots,\pi_{2^{(1-\gamma)n}}\}$ over
$B(pn,0)$ of size $2^{(1-\gamma)n}$ such that for every $x \in \{0,1\}^n$
the marginal distribution $W(\cdot|x)$ over $e$ is in the set $\pi$. For
$i \in [2^{(1-\gamma)n}]$ let $X_i$ be the subset of codewords $x$ in $C$ for
which $W(\cdot|x) = \pi_i(\cdot)$. We show that $C$ allows communication
over $W$ with error at most $\varepsilon$.

III. Proof of Theorem 2

In what follows we prove Theorem 2. We use the notation
outlined in the statement of Theorem 2. Let $N = 2^{Rn}$ and
$M = 2^n$. We occasionally identify codewords in $C$ with their
corresponding messages in $[N]$ and elements in $\{0,1\}^n$ with
integers in $[M]$. We first analyze the expected size of $A_e$
over random codebooks ($\Omega[n,2^{Rn}]$). For technical reasons,
throughout this section we treat codebooks $C$ as ordered
sets $(x_1,\dots,x_N)$ (instead of unordered sets). Accordingly,
we change the definition of $\Omega[n,2^{Rn}]$ to be the uniform
distribution over ordered codebooks.

Lemma 3.1: $E[|A_e|] \le 2^{(H(p)+2R-1)n}$.

Proof: For $i \in [N]$ let $A_e^i$ be the indicator of the event
"$x_i \in A_e$". Hence, $E[|A_e|] = \sum_{i=1}^{N} E[A_e^i]$. We turn to analyze
$E[A_e^i]$ for any given $i$. This value is exactly the probability
that the ball centered at $x_i \oplus e$ of radius $\|e\|$ includes an
additional codeword $x_j$. For a fixed $j \neq i$, this probability
is at most $2^{H(p)n}/2^n$. Here we use the fact that the size of
a Hamming ball of radius $pn$ is bounded by $2^{H(p)n}$ [12].
Thus, using the union bound on all $j \neq i$ in $[N]$, the value of
$E[A_e^i]$ is bounded by $2^{H(p)n+Rn}/2^n$. This in turn implies that
$E[|A_e|] \le 2^{(H(p)+2R-1)n}$. ■
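The two counting facts used in this proof can be made concrete for small parameters. The sketch below computes the exact size of a Hamming ball, compares it with the entropy bound $2^{H(p)n}$, and evaluates the union bound $N(N-1)\,|B(\|e\|,\cdot)|/2^n$ on $E[|A_e|]$ for uniformly random codewords. The function names and the use of the exact ball size (instead of the entropy bound) are our own choices for illustration.

```python
import math


def H(x: float) -> float:
    """Binary entropy function."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1 - x) * math.log2(1 - x)


def ball_volume(n: int, r: int) -> int:
    """Exact number of words in a Hamming ball of radius r in {0,1}^n."""
    return sum(math.comb(n, k) for k in range(r + 1))


def expected_disturbed_bound(n: int, N: int, w: int) -> float:
    """Union bound on E[|A_e|] for an error of weight w and N uniform codewords.

    Each ordered pair (i, j), j != i, contributes the probability that the
    uniform codeword x_j falls in the ball of radius w around x_i XOR e.
    """
    return N * (N - 1) * ball_volume(n, w) / 2 ** n
```

For instance, with $n = 20$ and $p = 0.25$ the ball of radius 5 contains 21700 words, below $2^{H(0.25)\cdot 20}$.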

We now turn to show that the size of $A_e$ is
strongly concentrated. The Lipschitz coefficients of
$|A_e|$ can be described by the following function $\Delta$.
For any $[n,N]$ codebook $C = (x_1,\dots,x_N)$, any
$i \in [N]$, and any $x \in \{0,1\}^n$ let
$\Delta(i,x,C) = \left| E(|A_e| : x_1,\dots,x_{i-1}, x_i = x) - E(|A_e| : x_1,\dots,x_{i-1}) \right|$.



The expectation above is over $\Omega[n,N]$. Given a small global
upper bound on the value of $\Delta$ one can prove the tight
concentration of $|A_e|$ using Azuma's inequality. However,
it is not hard to verify that $\Delta$ does not have a small global
bound in the case under study ($\Delta$ can be as large as a constant
fraction of $N$). Nevertheless, as we will show, the value of $\Delta$
is small on average and lends itself to the framework outlined
in [17], implying the desired concentration. Details follow.

Let $\ell = 12n^2$ be the list decoding parameter from
Lemma 2.1. Using a slight change of notation which fits our
needs, in Lemma 3.1 of [17] it is shown that:

Lemma 3.2 (Lemma 3.1 [17]): Let

$p_1 = \sum_{i=1}^{N} \Pr\left[\exists x \in \{0,1\}^n \text{ s.t. } \Delta(i,x,C) > \ell + 3\right]$,

$\Delta(i,x,C) > N(\ell + 3)$

For any $\lambda < 4N$,

$\Pr\left[|A_e| - E(|A_e|) > \sqrt{\lambda N (\ell+3)^2}\right] \le 2e^{-\lambda/4} + p_1 + p_2$.

All probabilities and expectations are over $\Omega[n,N]$.

Thus, to use the concentration results of [17] we must bound
$p_1$ and $p_2$ defined above. We start by computing the value of
$\Delta(i,x,C)$ for a given $[n,N]$ codebook $C = (x_1,\dots,x_N)$.
We use the following definitions. For a codebook $C$ and an
index $i \in [N]$ let $C|_i$ be the set of ordered $[n,N]$ codebooks
that agree with $C$ on the first $i$ codewords; namely, a codebook
$C' = (x'_1,\dots,x'_N)$ is in $C|_i$ iff for all $j \le i$ it holds that $x_j = x'_j$. For
an $[n,N]$ codebook $C = (x_1,\dots,x_N)$, an index $i \in [N]$, and
$x \in \{0,1\}^n$ let $C(i,x)$ be the codebook that agrees with $C$
on all but the $i$'th codeword, and on the $i$'th codeword equals
$x$. Recall that $\ell = 12n^2$. An $[n,N]$ codebook $C$ is said to be
typical if it is $[\ell,p]$ list decodable (the rest are referred to as
codebooks which are not typical). Denote the set of typical
$[n,N]$ codebooks by $T$ and the set of codebooks which are not typical
by $T^c$. By Lemma 2.1, at most a $2^{-n^2}$ fraction of the (ordered)
codebooks are not typical (i.e. $|T^c| \le M^N 2^{-n^2}$). Notice that
the size of $C|_{i-1}$ is $M^{N-i+1}$. Our definitions now imply that

We now analyze the value of $\big||A_e(C(i,x))| - |A_e(C)|\big|$ and
show its connection to the list decoding properties of $C$.

Lemma 3.3: If an $[n,N]$ codebook $C$ is typical then
$\big||A_e(C(i,x))| - |A_e(C)|\big| \le \ell + 2$. If $C$ is not typical then
$\big||A_e(C(i,x))| - |A_e(C)|\big| \le N$.

Proof: For the first part of the lemma notice that if $C$
is $[\ell,p]$ list decodable then $C(i,x)$ is $[\ell+1,p]$ list decodable.
Recall that a codeword $x_j$ of $C$ is said to be disturbed by the
error $e$ if $x_j \in A_e(C)$. The value of $|A_e(C(i,x))| - |A_e(C)|$ is
bounded by the maximum number of codewords $x_j$ disturbed
by the error $e$ exclusively due to the change of $x_i$. Namely,
this value is bounded by $|\{j \mid \|x_j \oplus x_i \oplus e\| \le \|e\|\}| + 1$
(an additional value of 1 is added for the case that $x_i$ may be
disturbed by $e$). This in turn is at most
$|\{j \mid x_j \in B(pn, x_i \oplus e)\}| + 1 \le \ell + 2$. An analogous analysis can be done for
$|A_e(C)| - |A_e(C(i,x))|$. The second part of the lemma follows
from the fact that $|A_e|$ is bounded by $N$. ■
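The first part of Lemma 3.3 can be sanity-checked by brute force on toy codebooks. The sketch below replaces a single codeword by every possible word and records the largest change in $|A_e|$; it then compares that change with the list size of the codebook. As assumptions of this illustration (not of the paper), the list size is measured at radius $\|e\|$ rather than $pn$, codewords are Python integers, and indices are 0-based.

```python
def weight(v: int) -> int:
    """Hamming weight of a binary word stored as an integer."""
    return bin(v).count("1")


def disturbed_count(codebook, e):
    """|A_e(C)|: codewords x_i with another codeword in B(||e||, x_i XOR e)."""
    r = weight(e)
    return sum(
        1 for i, x in enumerate(codebook)
        if any(j != i and weight(y ^ x ^ e) <= r for j, y in enumerate(codebook))
    )


def max_list_size(codebook, n, radius):
    """Largest number of codewords in any Hamming ball of the given radius."""
    return max(sum(1 for x in codebook if weight(x ^ y) <= radius)
               for y in range(2 ** n))


def worst_change(codebook, n, e, i):
    """Largest | |A_e(C(i,x))| - |A_e(C)| | over all replacements x of codeword i."""
    base = disturbed_count(codebook, e)
    worst = 0
    for x in range(2 ** n):
        modified = codebook[:i] + [x] + codebook[i + 1:]
        worst = max(worst, abs(disturbed_count(modified, e) - base))
    return worst
```

On any codebook whose balls of radius $\|e\|$ contain at most $\ell$ codewords, `worst_change` should not exceed $\ell + 2$, matching the bound of the lemma.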
Corollary 3.1: Let $\Gamma$ be the size of $C|_{i-1} \setminus T$. Then
$\Delta(i,x,C) \le M^{-(N-i)}\Gamma + \ell + 2$.

We now analyze $p_1$ and $p_2$ of Lemma 3.2.

Lemma 3.4: $p_1 \le 2^{-n^2} M N$.

Proof: Let $i \in [N]$. We first note that Corollary 3.1
implies that $\Delta(i,x,C) > \ell + 3$ only if the size of $C|_{i-1} \setminus T$ is
at least $M^{N-i}$. Moreover, by our definitions $|T^c| \le M^N 2^{-n^2}$
(recall that $T^c$ is the set of codebooks which are not typical).
We now use these facts to prove our assertion.

Notice that for two codebooks $C$ and $C'$ the sets $C|_{i-1}$
and $C'|_{i-1}$ are either equal or disjoint. We partition the set
of codebooks in $\Omega[n,N]$ into $M^{i-1}$ disjoint subsets of the
form $C|_{i-1}$. Denote these subsets by $\Omega_1,\dots,\Omega_{M^{i-1}}$. Let
$\alpha$ denote the number of these subsets that satisfy
$|\Omega_j \setminus T| \ge M^{N-i}$. As these sets are disjoint and
$|T^c| \le M^N 2^{-n^2}$, $\alpha$ is at most $M^i 2^{-n^2}$. Finally, for a given $i$,
$\Pr[\exists x \in \{0,1\}^n \text{ s.t. } \Delta(i,x,C) > \ell + 3] \le M^{-(i-1)}\alpha \le
M^{-(i-1)} M^i 2^{-n^2} \le 2^{-n^2} M$. ■
Lemma 3.5: $p_2 \le M 2^{-n^2}$.

Proof: Consider a codebook C, and the event 
.emrJi^ii^^^C) > Ar(^ + 3)". This event 
is included in the event $\sum_{i=1}^{N} \max_{x \in \{0,1\}^n} \Delta(i,x,C) >
N(\ell + 3)$. The above event holds only if the size of the set
$\{i \mid \max_{x \in \{0,1\}^n} \Delta(i,x,C) > \ell + 3\}$ is at least 1. We call
a codebook $C$ bad if $\{i \mid \max_{x \in \{0,1\}^n} \Delta(i,x,C) > \ell + 3\} \neq \emptyset$.
For each bad $C$ let $d(C) = i - 1$ where $i$ is the minimum
integer in $\{i \mid \max_{x \in \{0,1\}^n} \Delta(i,x,C) > \ell + 3\}$.

We now show that the number of bad $C$'s is less than
$M^{N+1} 2^{-n^2}$, which concludes our assertion. Consider the set
$\mathcal{B} = \mathcal{B}_1$ of bad codebooks $C$ (the indexing of $\mathcal{B}$ will be clear
shortly). Let $C_1$ be any bad codebook, and let $i_1 - 1 = d(C_1)$.
By Corollary 3.1, the size of $C_1|_{i_1-1} \setminus T$ is at least $M^{N-i_1}$.
Moreover, the size of $C_1|_{i_1-1}$ is exactly $M^{N-i_1+1}$. Let $\mathcal{B}_2$ be
$\mathcal{B}_1 \setminus C_1|_{i_1-1}$. Let $C_2$ be any codebook in $\mathcal{B}_2$ and let $d(C_2) =
i_2 - 1$. We now claim that the set $C_2|_{i_2-1}$ is disjoint from
$C_1|_{i_1-1}$. Assume otherwise; then $C_1|_{i_1-1}$ is strictly included
in $C_2|_{i_2-1}$ (recall $C_2 \notin C_1|_{i_1-1}$). This implies that $i_2 < i_1$ and
$\Delta(i_2,x,C_1) = \Delta(i_2,x,C_2)$, which contradicts the minimality
of $i_1 - 1 = d(C_1)$. Now as before, by Corollary 3.1, the size of
$C_2|_{i_2-1} \setminus T$ is at least $M^{N-i_2}$ and the size of $C_2|_{i_2-1}$ is exactly
$M^{N-i_2+1}$.

We continue this process iteratively; namely, at step $k$ we
choose a codebook $C_k \in \mathcal{B}_k$ and set $i_k - 1 = d(C_k)$. As
above we have that $C_k|_{i_k-1}$ is disjoint from $C_{k'}|_{i_{k'}-1}$ for
any $k' \neq k$, the size of $C_k|_{i_k-1} \setminus T$ is at least $M^{N-i_k}$, and
the size of $C_k|_{i_k-1}$ is exactly $M^{N-i_k+1}$. We define $\mathcal{B}_{k+1}$ to
be $\mathcal{B}_k \setminus C_k|_{i_k-1}$. We continue this process until $\mathcal{B}$ is entirely
covered. Let $k^*$ be the last step of our procedure (i.e. $\mathcal{B}_{k^*+1} =
\emptyset$). It is not hard to verify that
$|\mathcal{B}| \le \sum_{k=1}^{k^*} M^{N-i_k+1} = M\sum_{k=1}^{k^*} M^{N-i_k} \le M\,\left|\bigcup_{k=1}^{k^*} C_k|_{i_k-1} \setminus T\right| \le M|T^c|$, which
concludes our assertion. ■
Now combining the results of Lemma 3.2, 3.4 and 3.5; and 
setting A of Lemma 3.2 to be equal to we obtain the 



assertion stated in Theorem 2. In the above, by our setting
of parameters, notice that $\sqrt{\lambda N(\ell+3)^2} \le 2^{(H(p)+2R-1)n}$
(here we use the fact that $\gamma \in \left(\frac{2+H(p)}{3}, 1\right]$). The lower
bound of Theorem 1 now follows easily from Theorem 2 and
Lemma 2.2; a full proof is given in the Appendix.
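The way Theorem 2 and Lemma 2.2 combine can be sketched as follows. This is our own expansion of the step, not the Appendix proof; it assumes the statements of Theorem 2, Lemma 3.1 and Lemma 2.2 as given above, uses $|B(pn,0)| \le 2^{H(p)n}$, and does not optimize constants.

```latex
\begin{align*}
\Pr\Big[\exists\, e \in B(pn,0) :\ |A_e| - E(|A_e|) > 2^{(H(p)+2R-1)n}\Big]
  &\le |B(pn,0)| \cdot 2^{-2n} \;\le\; 2^{H(p)n}\, 2^{-2n} \;<\; 1, \\
\text{so some codebook } C \text{ satisfies}\quad
|A_e| &\le E(|A_e|) + 2^{(H(p)+2R-1)n} \;\le\; 2 \cdot 2^{(H(p)+2R-1)n}
  \quad \forall\, e \in B(pn,0), \\
H(p) + 2R - 1 &= R - (1-\gamma) - \delta
  \qquad (\text{since } R = \gamma - H(p) - \delta), \\
|A_e| &\le 2^{1-\delta n} \cdot 2^{(R-(1-\gamma))n} \;\le\; \varepsilon\, 2^{(R-(1-\gamma))n}
  \qquad \text{once } 2^{1-\delta n} \le \varepsilon .
\end{align*}
```

The last line is the hypothesis of Lemma 2.2, so for every fixed $\varepsilon, \delta > 0$ and sufficiently large $n$ such a codebook allows communication over $\mathcal{W}_{p,\gamma}$ at rate $\gamma - H(p) - \delta$ within error $\varepsilon$.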

IV. Conclusion 

In this work we define and study the capacity of $\mathcal{W}_{p,\gamma}$ (the
family of all binary $\gamma$-oblivious $p$-channels). Such families
of channels arise when considering communication in an
adversarial jamming model in which the jammer $Z$ is limited
in resources. We limit the jammer by both a power constraint
and by the restriction to impose its errors based only on a
small number of possible error distributions. For $\gamma = 1$ such
families are closely related to AVCs with state constraints,
and it has been shown in [4], [5] that $C(\mathcal{W}_{p,1}) = 1 - H(p)$.
We show for $p < 1/2$ and $\gamma \in \left(\frac{2+H(p)}{3}, 1\right]$ that $C(\mathcal{W}_{p,\gamma})$
is at least $\gamma - H(p)$. For $\gamma = 1$ our contribution is in
our new proof technique. Roughly speaking, our proof is of
combinatorial nature, is based on a relatively new "strong
concentration inequality" of [17], and differs substantially
from the proof presented in [4], [5]. For $\gamma \in (0,1)$ this work
initiates the study of $\gamma$-oblivious channels.

Appendix

A. Maximum error and Random coding 

For a channel $W$ and a code $C$ define the maximum error
$\hat{e} = \hat{e}(C,W) = \max_i e(i)$. For a family of channels $\mathcal{W}$,
let $C^m(\mathcal{W})$ be the capacity of $\mathcal{W}$ with respect to $\hat{e}$. In this
work we did not address $C^m(\mathcal{W}_{p,\gamma})$ for $\gamma \in [0,1]$ as it
holds that $C^m(\mathcal{W}_{p,\gamma}) = C^m(\mathcal{W}_p)$. This follows directly from
our definitions.

Let $C^*$ be a distribution over $[n,N]$ codes. $C^*$ is said to
allow communication over $\mathcal{W}_{p,\gamma}$ with maximum error $\varepsilon$ if for
each $W \in \mathcal{W}_{p,\gamma}$ the expected error $E[\hat{e}(C,W)]$ is at most
$\varepsilon$ (here the expectation is over $C^*$). The random capacity
$C^r$ is now defined analogously to the deterministic
capacity used throughout the paper. In [9], [11] it is shown
that $C^r(\mathcal{W}_{p,0}) = 1 - H(p)$. Let $\pi$ be the distribution over
errors $e \in \{0,1\}^n$ in which $\Pr[\pi = e] = p^{\|e\|}(1-p)^{n-\|e\|}$.
Let $\pi'$ be $\pi$ restricted to errors $e$ with Hamming weight less
than or equal to $pn$. Let $W_{\pi'} \in \mathcal{W}_{p,1}$ be the channel in
which $W_{\pi'}(\cdot|x) = \pi'$ for all $x \in \{0,1\}^n$. It now holds
that $C^r(\mathcal{W}_{p,1}) \le C^r(W_{\pi'}) = C^m(W_{\pi'}) \le 1 - H(p)$ (the
last inequality is proven in Section C of this Appendix). We
conclude that $C^r(\mathcal{W}_{p,\gamma}) = 1 - H(p)$ for $\gamma \in [0,1]$.

B. Average vs. maximum error in Wp 

As above, for a channel $W$ and a code $C$ define the
maximum error $\hat{e} = \max_i e(i)$. For a family of channels $\mathcal{W}$,
let $C^m(\mathcal{W})$ be the capacity of $\mathcal{W}$ with respect to $\hat{e}$, and $C^a(\mathcal{W})$
be the capacity of $\mathcal{W}$ with respect to $\bar{e}$.

Lemma 1.1: $C^a(\mathcal{W}_p) = C^m(\mathcal{W}_p)$.

Proof: It is clear that $C^a(\mathcal{W}_p) \ge C^m(\mathcal{W}_p)$; thus we
prove the missing inequality. For a given $0 < \varepsilon < 1/4$
assume the existence of an $[n,N]$ code $\mathcal{C} = (C,\phi)$ that allows
communication over $\mathcal{W}_p$ with $\bar{e} \le \varepsilon$. Let $C = \{x_1,\dots,x_N\}$.
Let $N = 2^{Rn}$. We will show the existence of a subset $C'$ of
$C$ of size at least $2^{Rn-1}$ s.t. using $\mathcal{C}' = (C',\phi')$ on $\mathcal{W}_p$ we
obtain $\hat{e} = 0$ (here $\phi'$ is the Nearest Neighbor decoder). This
is enough to prove our assertion.

Consider the following graph $G$ with vertex set $C$ and an
edge between $x_i$ and $x_j$ iff $\|x_i \oplus x_j\| \le 2pn$. Let $M$ be a
maximal matching in $G$, namely a maximal set of edges $M$
such that every vertex in $G$ is adjacent to at most a single edge
in $M$. Consider the subgraph $G_M$ of $G$ in which we include
only edges in the matching $M$. Let $I_M$ be the set of vertices
in $G_M$ with no adjacent edges. $I_M$ is an independent set in
$G$ (and also in $G_M$). In other words, the codebook consisting
of codewords in $I_M$ has minimum distance at least $2pn + 1$ and thus
when used with the Nearest Neighbor decoder $\phi'$ on $\mathcal{W}_p$ will
have error $\hat{e} = 0$. It is left to show that $|I_M|$ is large. Let
$W(y|x)$ be the following channel: 1) for codewords $x_i$ with a
corresponding codeword $x_j$ s.t. the edge $(x_i,x_j)$ is in $M$, set
$W(y|x_i) = 1$ where $y$ is the center of the minimum radius
ball in $\{0,1\}^n$ including $x_i$ and $x_j$; 2) for the remaining
$x \in \{0,1\}^n$ set $W(x|x) = 1$. Notice that $W \in \mathcal{W}_p$. It now follows
that the average decoding error of $C$ when communicating on
$W$ satisfies $\frac{|C| - |I_M|}{2|C|} \le \bar{e} \le \varepsilon$. This implies that
$|I_M| \ge (1 - 2\varepsilon)|C| \ge 2^{Rn-1}$. ■
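The construction of $I_M$ in this proof can be sketched in a few lines. The code below builds a maximal matching greedily (one simple way to obtain a maximal matching; the proof does not prescribe a particular one) on the graph whose edges join codewords at distance at most twice the radius, and returns the unmatched codewords. Codewords are assumed to be Python integers, and `radius` plays the role of $pn$.

```python
def distance(a: int, b: int) -> int:
    """Hamming distance between two binary words stored as integers."""
    return bin(a ^ b).count("1")


def unmatched_codewords(codebook, radius):
    """Codewords left unmatched by a greedy maximal matching.

    Two codewords are adjacent when their distance is at most 2 * radius.
    Because the matching is maximal, no two unmatched codewords are adjacent,
    so the returned set has minimum distance at least 2 * radius + 1.
    """
    matched = set()
    for i in range(len(codebook)):
        if i in matched:
            continue
        for j in range(i + 1, len(codebook)):
            if j not in matched and distance(codebook[i], codebook[j]) <= 2 * radius:
                matched.update((i, j))
                break
    return [x for k, x in enumerate(codebook) if k not in matched]
```

With `radius = 1` and the codebook `[0b000000, 0b000001, 0b111111, 0b111000]`, the first two codewords are matched to each other and the last two survive, at mutual distance 3.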

C. Upper bound on $C(\mathcal{W}_{p,1})$

Let $\pi$ be the distribution over errors $e \in \{0,1\}^n$ in which
$\Pr[\pi = e] = p^{\|e\|}(1-p)^{n-\|e\|}$. Let $\pi'$ be $\pi$ restricted to
errors $e$ with Hamming weight less than or equal to $pn$. Let
$W_\pi$ be the channel in which $W_\pi(\cdot|x) = \pi$ for all $x \in \{0,1\}^n$.
Let $W_{\pi'}$ be the channel in which $W_{\pi'}(\cdot|x) = \pi'$ for all
$x \in \{0,1\}^n$. Notice that $W_{\pi'} \in \mathcal{W}_{p,1}$ and that $W_\pi = W_{BSC_p}$.
We now show that $C(W_{\pi'}) \le 1 - H(p)$ (this will suffice to
prove our assertion). Assume otherwise, namely that for $R >
1 - H(p)$, $\varepsilon < 1/4$ and sufficiently large $n$ there exist $[n,2^{Rn}]$
codes $C$ which allow communication over $W_{\pi'}$ within error $\varepsilon$.
This implies that $C$ allows communication over $W_\pi = W_{BSC_p}$
within constant error bounded away from 1. This contradicts
a fundamental result on the $\varepsilon$-capacity of $W_{BSC_p}$ [14].

D. Attempt for an alternative definition for obliviousness 

An alternative definition to $\gamma$-oblivious channels is
$I = \max_X I(X; Z) \le (1-\gamma)n$. Here $X$ represents any
distribution over codewords transmitted, and $Z$ denotes the error
imposed by the channel. The random variables $X$ and $Z$ are jointly
distributed according to $\Pr[X = x, Z = e] = \Pr[X = x]\,W(e|x)$.
There are various connections between the suggested definition
and the original one given in Definition 1.1. Namely, it is
not hard to verify that if a channel $W$ is $\gamma$-oblivious by
Definition 1.1 then it is $\gamma$-oblivious by the above definition.
The other direction holds for $\gamma = 0$ or $1$ but is not
necessarily true for $\gamma \in (0,1)$. For example, consider a
channel $W(e|x)$ defined by a set of errors $\{e_x\}$ (each of
Hamming weight at most $pn$) indexed by $x \in \{0,1\}^n$:
$W(e_x|x) = \varepsilon + \alpha$, and for every other error
$e \ne e_x$ of weight at most $pn$, $W(e|x) = \alpha$.
Here $\alpha$ is $(1-\varepsilon)/Vol(pn)$ where $Vol(pn)$ is the
size of a Hamming ball of radius $pn$ in $\{0,1\}^n$. Consider the
family of channels $\mathcal{W}$ consisting of all such channels $W$.
This family is $(1-\varepsilon)$-oblivious by the suggested
definition and only $(1 - H(p))$-oblivious by Definition 1.1. It is
not hard to verify that the capacity of $\mathcal{W}$ is that of
$\mathcal{W}_p$. This implies a discontinuity in the capacity of
$\gamma$-oblivious $p$-channels when using the suggested
definition at the point $\gamma = 1$.
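As a rough numerical check of the example channel, the sketch below computes $Vol(pn)$, $\alpha$, and an upper bound on $I(X;Z)$. The bound uses only the standard facts $I(X;Z) = H(Z) - H(Z|X)$ and $H(Z) \le \log_2 Vol(pn)$ (since $Z$ is supported on the ball); it is not the authors' calculation. The function names and parameter values are illustrative assumptions.

```python
from math import comb, floor, log2

def ball_volume(n, r):
    """Number of words of Hamming weight at most r in {0,1}^n."""
    return sum(comb(n, w) for w in range(r + 1))

def example_channel_bound(n, p, eps):
    """Return (vol, alpha, bound) with bound >= I(X;Z) for any input X.

    For every x the row W(.|x) puts eps + alpha on e_x and alpha on each of
    the other vol - 1 errors in the ball, so H(Z | X = x) does not depend on x.
    """
    vol = ball_volume(n, floor(p * n))
    alpha = (1 - eps) / vol
    peak = eps + alpha
    h_cond = -peak * log2(peak) - (vol - 1) * alpha * log2(alpha)
    return vol, alpha, log2(vol) - h_cond

n, p, eps = 20, 0.25, 0.1  # illustrative values
vol, alpha, bound = example_channel_bound(n, p, eps)
```

A short calculation shows `bound <= (eps + alpha) * log2(vol) <= (eps + alpha) * n`, so the mutual information grows at most like $\varepsilon n$, in line with the family being roughly $(1-\varepsilon)$-oblivious under the suggested definition.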

E. Linear Codes 

Lemma 1.2: Let $C$ be any $[n, 2^{Rn}]$ linear codebook. Let
$\gamma \in [0,1]$. There exists a decoder $\phi$ such that $C, \phi$
allow communication over $\gamma$-oblivious $p$-channels within error
less than 1/2 iff $C$ has minimum distance of value at least $2pn+1$.
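The distance condition in Lemma 1.2 can be explored by brute force on a small linear code. The sketch below uses the [7,4] Hamming code purely as an illustrative choice (it does not appear in the text), and all helper names are hypothetical. It checks the minimum distance, splits a low-weight codeword $x^*$ into two errors $e_1, e_2$ of weight at most the radius, and confirms that $x \oplus e_1 = (x \oplus x^*) \oplus e_2$ for every codeword $x$, so no decoder can be right for both $x$ and $x \oplus x^*$.

```python
from itertools import product

# Generator rows of the [7,4] Hamming code as 7-bit integers (illustrative).
G = [0b1000011, 0b0100101, 0b0010110, 0b0001111]

def span(rows):
    """All GF(2) linear combinations of the given rows."""
    words = set()
    for coeffs in product((0, 1), repeat=len(rows)):
        w = 0
        for c, r in zip(coeffs, rows):
            if c:
                w ^= r
        words.add(w)
    return sorted(words)

def weight(v):
    """Hamming weight of an integer-encoded word."""
    return bin(v).count("1")

def min_distance(code):
    """For a linear code this is the minimum weight of a nonzero codeword."""
    return min(weight(c) for c in code if c)

def split_low_weight(x_star, radius):
    """Write x_star = e1 ^ e2 with weight(e1), weight(e2) <= radius.

    Requires weight(x_star) <= 2 * radius.
    """
    bits = [1 << k for k in range(x_star.bit_length()) if x_star >> k & 1]
    e1 = 0
    for b in bits[:radius]:
        e1 |= b
    return e1, x_star ^ e1

def nn_decode(y, code):
    """Nearest Neighbor decoder; ties are broken toward the smaller integer."""
    return min(code, key=lambda c: (weight(c ^ y), c))

code = span(G)
radius = 2                      # 2 * radius + 1 = 5 exceeds the distance 3
x_star = min((c for c in code if c), key=weight)
e1, e2 = split_low_weight(x_star, radius)

# Codewords decoded correctly under e1, and under e2 after shifting by x_star.
A1 = {x for x in code if nn_decode(x ^ e1, code) == x}
A2 = {x for x in code if nn_decode((x ^ x_star) ^ e2, code) == x ^ x_star}
```

With `radius = 1` the condition $2pn + 1 \le 3$ holds and the Nearest Neighbor decoder makes no errors; with `radius = 2` the sets `A1` and `A2` are disjoint, so at least one of the two deterministic channels forces an error of at least 1/2.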



Proof: Let $C$ be any codebook with minimum distance of
value at least $2pn+1$. Let $\phi$ be the Nearest Neighbor decoder.
Then for every $p$-channel $W$ it holds that
$\varepsilon = \frac{1}{N}\sum_{i=1}^{N} \varepsilon(i) = 0$
(where $N = 2^{Rn}$), implying that $C$ allows communication over
$\gamma$-oblivious $p$-channels with error 0.

Let $C = \{x_1, \ldots, x_{2^{Rn}}\}$ be an $[n, 2^{Rn}]$ linear
codebook with minimum distance less than $2pn + 1$. Let $\phi$ be
any decoder. By the linearity of $C$, the bound on the minimum
distance implies the existence of a codeword $x^*$ of weight at most
$2pn$ (where $x^* \ne 0$). Let $e_1$ be any error in $\{0,1\}^n$ of
Hamming weight at most $pn$ such that $x^* \in B(pn, e_1)$. Let
$e_2$ be $x^* \oplus e_1$. Notice that $e_2$ is of Hamming weight at
most $pn$. Notice also that for any codeword $x$ it holds that
$x \oplus e_1 = (x \oplus x^*) \oplus e_2 = x' \oplus e_2$
(here $x' = x \oplus x^*$ is a codeword of $C$). Consider the sets
$A_1 = \{x_i \mid \phi(x_i \oplus e_1) = i\}$ and
$A_2 = \{x_i \mid x_i \oplus x^* = x_j \text{ and } \phi((x_i \oplus x^*) \oplus e_2) = j\}$.
The sets $A_1$ and $A_2$ are disjoint. Thus at least one of the sets
is of size at most $2^{Rn}/2$, say $A_1$ (a similar proof can be
given for $A_2$). Let $W$ be the deterministic 1-oblivious
$p$-channel for which $\forall x\ W(e_1|x) = 1$. We conclude that
$\varepsilon = \frac{1}{N}\sum_{i=1}^{N} \varepsilon(i) \ge \frac{1}{2}$,
implying that $C$ does not allow communication over 1-oblivious
$p$-channels within error less than 1/2. As $W$ is also a
$\gamma$-oblivious channel for any $\gamma \in [0,1]$ we conclude
our assertion. ■

F. List decodability of random codes

Lemma 1.3: Let $R \in (0,1)$. Let $c$ be a sufficiently large
universal constant. Let $\ell$ = max($Vol(pn)2^{-n+Rn+1}$, cn^).
Let $C$ be a random codebook in $\Omega[n, 2^{Rn}]$. With probability
at least 1 - e-^/^2'', $C$ is $[\ell, p]$ list decodable.

Proof: Let $B$ be any ball of radius $pn$ in $\{0,1\}^n$.
The expected number of points in the intersection of $C$ and
$B$ is $E = Vol(pn)2^{-n+Rn}$. Let $\ell$ = max($2E$, cn^). The
probability, for a specific ball $B$ of radius $pn$, that
$|C \cap B|$ is less than $\ell$ (which is at least twice its
expectation) is at least 1 - e^^''^. For $\ell = 2E$ this follows by
applying the Chernoff bound [8]. For $\ell$ = cn^ > $2E$ this follows
by studying the probability that $|C \cap B'| < \ell$ for a larger
subset $B'$ including $B$. Thus the probability that this holds for
every ball of radius $pn$ in $\{0,1\}^n$ is at least
1 - e-^/'^2".

G. Proof of Theorem 1

Let $p \in [0, 1/2)$. Let $\gamma \in$ , 1 . Let $\varepsilon > 0$ and
$\delta > 0$ be any sufficiently small constants. Let
$R = \gamma - H(p) - \delta$. We show that for sufficiently large
$n$ there exist $[n, 2^{Rn}]$ codes $C$ which allow communication
over $\gamma$-oblivious $p$-channels with error $\varepsilon$. The
decoder $\phi$ used is the Nearest Neighbor decoder. By Lemma 2.2 it
suffices to show the existence of codebooks $C$ for which
$|A_e(C)|$ is smaller than $\varepsilon 2^{(R-(1-\gamma))n}$ for
every $e \in B(pn, 0)$. Let $C$ be a random codebook in
$\Omega[n, 2^{Rn}]$. The probability that $|A_e(C)|$ is greater than
$2^{(H(p)+2R-1)n+1}$ for a specific error $e \in B(pn, 0)$ is at
most 2^^". This follows by Theorem 2 and Lemma 3.1. By our setting
of parameters
$2^{(H(p)+2R-1)n+1} < \varepsilon 2^{(R-(1-\gamma))n}$. Now,
applying the union bound over all errors $e \in B(pn, 0)$ we
conclude that the probability that $|A_e(C)|$ is greater than
$\varepsilon 2^{(R-(1-\gamma))n}$ for any error $e \in B(pn, 0)$ is
at most 2~'^"' $Vol(pn)$ < 1. This implies the existence of an
$[n, 2^{Rn}]$ code as asserted in Theorem 1.
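The parameter step in the proof of Theorem 1, namely that $2^{(H(p)+2R-1)n+1} < \varepsilon 2^{(R-(1-\gamma))n}$ once $R = \gamma - H(p) - \delta$, reduces to comparing exponents: the right exponent minus the left one equals $\delta n - 1 - \log_2(1/\varepsilon)$. The sketch below checks this numerically. The function names and the sample values of `p`, `gamma`, `delta` and `eps` are illustrative assumptions and have not been checked against the range of $\gamma$ required by the theorem.

```python
from math import floor, log2

def binary_entropy(p):
    """Binary entropy H(p) in bits."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * log2(p) - (1 - p) * log2(1 - p)

def exponent_gap(p, gamma, delta, eps, n):
    """log2 of (eps * 2^((R-(1-gamma))n)) minus log2 of 2^((H(p)+2R-1)n+1)."""
    h = binary_entropy(p)
    rate = gamma - h - delta
    lhs = (h + 2 * rate - 1) * n + 1
    rhs = log2(eps) + (rate - (1 - gamma)) * n
    return rhs - lhs

def min_block_length(delta, eps):
    """Smallest integer n with delta * n > 1 + log2(1/eps)."""
    return floor((1 + log2(1 / eps)) / delta) + 1

p, gamma, delta, eps = 0.1, 0.9, 0.05, 0.01  # illustrative values
n0 = min_block_length(delta, eps)
```

The inequality holds exactly when `exponent_gap(...) > 0`, that is, for every `n >= n0`; the threshold depends only on `delta` and `eps`, not on `p` or `gamma`.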



